TNO-UT at TREC-9: How Different are Web Documents?

Authors

  • Wessel Kraaij
  • Thijs Westerveld
Abstract

Although at first sight the web track might seem a copy of the ad hoc track, we discovered that some small adjustments had to be made to our systems to run the web evaluation. As we expected, the basic language-model-based IR model worked effectively on this data. Blind feedback methods, however, seem less effective on web data. We also experimented with rescoring the documents based on several algorithms that exploit link information. These methods yielded no positive result.

Similar resources

Relevance Feedback as an Indicator to Select the Best Search Engine - Evaluation on TREC Data

This paper explores information retrieval system variability and takes advantage of the fact that two systems can retrieve different documents for a given query. More precisely, our approach is based on data fusion (fusion of system results) that takes into account the local performance of each system. Our method considers the relevance of the very first documents retrieved by different systems and from...

Overview of the TREC-8 Web Track

The TREC-8 Web Track defined ad hoc retrieval tasks over the 100 gigabyte VLC2 collection (Large Web Task) and a selected 2 gigabyte subset known as WT2g (Small Web Task). Here, the guidelines and resources for both tasks are described and results presented and analysed. Performance on the Small Web was strongly correlated with performance on the regular TREC Ad Hoc task. Little benefit was der...

ICTNET at Federated Web Search Track 2013

This paper describes work done for the result merging task of the TREC 2013 Federated Web Track. We introduce three methods for calculating document scores. These methods are based on linear combinations of document field scores and differ in their weight factors. The baseline method combines the scores of basic HTML fields. A page rank score is added in the second method. Do...

Overview of the TREC 2011 Web Track

The TREC Web Track explores and evaluates Web retrieval technology over large collections of Web data. In its current incarnation, the Web Track has been active since TREC 2009, where it included both a traditional adhoc retrieval task and a new diversity task [4]. The goal of this diversity task is to return a ranked list of pages that together provide complete coverage for a query, while avoi...

Cross Language Information Retrieval for Biomedical Literature

This workshop report discusses the collaborative work of UT, EMC and TNO on the TREC Genomics Track 2007. The biomedical information retrieval task is approached using cross language methods, in which biomedical concept detection is combined with effective IR based on unigram language models. Furthermore, a co-occurrence method is used to select and filter candidate answers. On its own, the cro...

Publication date: 2000